
create initial benchmark #459

Open

evisdren wants to merge 1 commit into main from optimize_claude_start

Conversation

@evisdren
Contributor

Entire-Checkpoint: 4157af44db5f
Copilot AI review requested due to automatic review settings February 22, 2026 21:53
@evisdren evisdren requested a review from a team as a code owner February 22, 2026 21:53
@cursor

cursor bot commented Feb 22, 2026

PR Summary

Low Risk
Adds benchmark-only helpers and an integration benchmark; no production logic or data handling paths are modified.

Overview
Adds an integration benchmark that measures end-to-end latency of the hooks claude-code session-start subprocess, with sub-benchmarks that independently scale session state file count, git ref/branch count, tracked repo file count, and commit history depth.

Extends benchutil.BenchRepo with helpers to seed many branch refs and run git pack-refs --all, enabling more realistic ref-scanning performance testing.

Written by Cursor Bugbot for commit 8a0416e.


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

Comment @cursor review or bugbot run to trigger another review on this PR

}

b.ReportMetric(float64(time.Since(start).Milliseconds()), "ms/op")
}

ReportMetric in loop only keeps last iteration's value

Medium Severity

b.ReportMetric overwrites any previously reported value for the same unit, so calling it inside the for range b.N loop means only the last iteration's latency is reported as ms/op; all earlier measurements are discarded. Instead, accumulate the total time across all iterations, divide by b.N after the loop ends, and report the result once with a single b.ReportMetric call. With -benchtime=5x, the current code silently throws away 4 out of 5 samples.


Contributor

Copilot AI left a comment


Pull request overview

This PR adds initial benchmarking infrastructure to measure the performance of the entire hooks claude-code session-start command under various scaling conditions. The benchmarks help identify performance bottlenecks as repositories grow in size, number of sessions, refs, and commit history.

Changes:

  • Added integration benchmark suite for hook session-start latency measurements
  • Extended benchutil with helpers to seed branches and pack refs for realistic testing scenarios

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

  • cmd/entire/cli/integration_test/hook_bench_test.go: New benchmark suite testing session-start hook performance across four scaling dimensions: session count, ref count, repo files, and commit history
  • cmd/entire/cli/benchutil/benchutil.go: Added SeedBranches and PackRefs helper methods to simulate repos with many git refs for benchmark testing

claudeProjectDir := b.TempDir()

b.ResetTimer()
for range b.N {

Copilot AI Feb 22, 2026


This benchmark uses the older for range b.N pattern instead of the newer b.Loop() pattern introduced in Go 1.24. The codebase uses Go 1.25.6, and all other benchmarks in benchutil_test.go consistently use b.Loop().

While for range b.N still works, using b.Loop() is the modern approach and provides better ergonomics. Consider updating this to match the pattern used throughout the codebase for consistency.

Copilot uses AI. Check for mistakes.
Comment on lines +142 to +156
start := time.Now()

cmd := exec.Command(binary, "hooks", "claude-code", "session-start")
cmd.Dir = repo.Dir
cmd.Stdin = bytes.NewReader(stdinPayload)
cmd.Env = append(os.Environ(),
"ENTIRE_TEST_CLAUDE_PROJECT_DIR="+claudeProjectDir,
)

output, err := cmd.CombinedOutput()
if err != nil {
b.Fatalf("session-start hook failed: %v\nOutput: %s", err, output)
}

b.ReportMetric(float64(time.Since(start).Milliseconds()), "ms/op")

Copilot AI Feb 22, 2026


Using manual timing with time.Now() and b.ReportMetric alongside b.ResetTimer() creates competing timing mechanisms. When using b.Loop(), the testing framework automatically handles timing, eliminating the need for manual time.Now()/time.Since() calls and b.ReportMetric().

If you switch to b.Loop() as suggested, you can remove the manual timing code (lines 142, 156) and let the framework handle it automatically. The framework's timing is more accurate as it accounts for framework overhead.


Labels

None yet


1 participant